Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: don't panic for some errors in modelstate #42

Merged
merged 2 commits into from
Mar 30, 2023

Conversation

tjohnson31415
Copy link
Contributor

Motivation

If a model load timeout is hit a panic is generated in the modelStateManager:

panic: interface conversion: interface {} is nil, not *mmesh.LoadModelResponse

goroutine 1397 [running]:
github.com/kserve/modelmesh-runtime-adapter/model-serving-puller/server.(*modelStateManager).loadModel(...)
	/opt/app-root/src/model-serving-puller/server/modelstate.go:107
github.com/kserve/modelmesh-runtime-adapter/model-serving-puller/server.(*PullerServer).LoadModel(0xc00016c550, {0x12df4f8, 0xc0005fc960}, 0x485800)
	/opt/app-root/src/model-serving-puller/server/server.go:116 +0xad
github.com/kserve/modelmesh-runtime-adapter/internal/proto/mmesh._ModelRuntime_LoadModel_Handler({0x10219e0, 0xc00016c550}, {0x12df4f8, 0xc0005fc960}, 0xc0004602a0, 0x0)
	/opt/app-root/src/internal/proto/mmesh/model-runtime_grpc.pb.go:181 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00054a1c0, {0x12f37d0, 0xc0006ca4e0}, 0xc000b326c0, 0xc00027daa0, 0x1a6f780, 0x0)
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:1301 +0xb03
google.golang.org/grpc.(*Server).handleStream(0xc00054a1c0, {0x12f37d0, 0xc0006ca4e0}, 0xc000b326c0, 0x0)
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:1642 +0xa2a
google.golang.org/grpc.(*Server).serveStreams.func1.2()
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:938 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:936 +0x294

In a couple of error cases in submitRequest, nil is returned as the first return value with the error. The code in loadModel and unloadModel always attempts to cast the value to a pointer to a response, but this will panic if attempting to convert nil.

Modifications

  • add a test to reproduce the panic
  • change the code to use a comma-ok type assertion instead of panicking

Result

The puller/adapter doesn't crash when a model load times out.

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
@ckadner ckadner requested review from njhill and removed request for pvaneck and chinhuang007 March 30, 2023 05:13
Copy link
Member

@njhill njhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks @tjohnson31415!

@kserve-oss-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: njhill, tjohnson31415

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@njhill
Copy link
Member

njhill commented Mar 30, 2023

/lgtm

@kserve-oss-bot kserve-oss-bot merged commit 3632b16 into kserve:main Mar 30, 2023
Jooho referenced this pull request in Jooho/modelmesh-runtime-adapter May 1, 2023
… - bug fixes

Signed-off-by: jooho <jlee@redhat.com>
Jooho referenced this pull request in red-hat-data-services/modelmesh-runtime-adapter May 16, 2023
#### Motivation

If a model load timeout is hit a panic is generated in the modelStateManager:
```
panic: interface conversion: interface {} is nil, not *mmesh.LoadModelResponse

goroutine 1397 [running]:
github.com/kserve/modelmesh-runtime-adapter/model-serving-puller/server.(*modelStateManager).loadModel(...)
	/opt/app-root/src/model-serving-puller/server/modelstate.go:107
github.com/kserve/modelmesh-runtime-adapter/model-serving-puller/server.(*PullerServer).LoadModel(0xc00016c550, {0x12df4f8, 0xc0005fc960}, 0x485800)
	/opt/app-root/src/model-serving-puller/server/server.go:116 +0xad
github.com/kserve/modelmesh-runtime-adapter/internal/proto/mmesh._ModelRuntime_LoadModel_Handler({0x10219e0, 0xc00016c550}, {0x12df4f8, 0xc0005fc960}, 0xc0004602a0, 0x0)
	/opt/app-root/src/internal/proto/mmesh/model-runtime_grpc.pb.go:181 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0xc00054a1c0, {0x12f37d0, 0xc0006ca4e0}, 0xc000b326c0, 0xc00027daa0, 0x1a6f780, 0x0)
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:1301 +0xb03
google.golang.org/grpc.(*Server).handleStream(0xc00054a1c0, {0x12f37d0, 0xc0006ca4e0}, 0xc000b326c0, 0x0)
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:1642 +0xa2a
google.golang.org/grpc.(*Server).serveStreams.func1.2()
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:938 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/remote-source/deps/gomod/pkg/mod/google.golang.org/grpc@v1.49.0/server.go:936 +0x294
```

In a couple of error cases in `submitRequest`, `nil` is returned as the first return value with the error. The code in `loadModel` and `unloadModel` always attempts to cast the value to a pointer to a response, but this will panic if attempting to convert `nil`.

#### Modifications

- add a test to reproduce the panic
- change the code to use a comma-ok type assertion instead of panicking

#### Result

The puller/adapter doesn't crash when a model load times out.


Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants